*Estimating Time-Varying Effects of Efficacy in a Cox Model

Gianfranco Di Gennaro

Join Date: Oct 2020
Posts: 144

*Estimating Time-Varying Effects of Efficacy in a Cox Model

25 Mar 2025, 11:13

Dear all,

I am analyzing a multi-record survival dataset where the outcome is time to treatment discontinuation (variable AED_4_trattamenti, with three drug categories). My model includes two key covariates that I hypothesize may act as mediators:

dicOutcome (drug responder vs. non-responder, a proxy for efficacy).
EventiYN (presence/absence of adverse events).
To account for potential time-varying effects, I included interactions between dicOutcome, EventiYN, efficacy, treatments, and the natural log of time (ln_time).

My code is like this:

Code:

stset followup [pweight=ipws], id(id) failure(OutYN==0)

gen lntime = ln(_t)
*using AED_4_trattamenti==1 as reference
gen trt_bri = AED_4_trattamenti == 0
gen trt_per = AED_4_trattamenti == 2

gen trt_bri_lntime = trt_bri * lntime
gen trt_per_lntime = trt_per * lntime

gen eventi_lntime = EventiYN * lntime
gen dicOut_lntime = dicOutcome * lntime

stcox ib1.AED_4_trattamenti i.EventiYN i.dicOutcome trt_bri_lntime trt_per_lntime eventi_lntime dicOut_lntime

Results:

Code:

. stcox ib1.AED_4_trattamenti i.EventiYN i.dicOutcome trt_bri_lntime trt_per_lntime eventi_lntime dicOut_lntime

        Failure _d: OutYN==0
  Analysis time _t: followup
       ID variable: id
            Weight: [pweight=ipws]

(sum of wgt is 2,416.93711814284)
Iteration 0:   log pseudolikelihood = -1039.5131
Iteration 1:   log pseudolikelihood = -967.11543
Iteration 2:   log pseudolikelihood =  -960.4047
Iteration 3:   log pseudolikelihood = -960.33864
Iteration 4:   log pseudolikelihood =  -960.3386
Refining estimates:
Iteration 0:   log pseudolikelihood =  -960.3386

Cox regression with Breslow method for ties

No. of subjects =         779                           Number of obs =  2,443
No. of failures =         161
Time at risk    = 19,445.4612
                                                        Wald chi2(8)  =  95.40
Log pseudolikelihood = -960.3386                        Prob > chi2   = 0.0000

                                        (Std. err. adjusted for 791 clusters in id)
-----------------------------------------------------------------------------------
                  |               Robust
               _t | Haz. ratio   std. err.      z    P>|z|     [95% conf. interval]
------------------+----------------------------------------------------------------
AED_4_trattamenti |
    Brivaracetam  |   122.0669    215.523     2.72   0.007      3.83446    3885.897
      Perampanel  |   265.4937   438.3501     3.38   0.001     10.43893     6752.31
                  |
       1.EventiYN |   12.74878   11.09015     2.93   0.003     2.317445    70.13385
     1.dicOutcome |   9.365561   9.098036     2.30   0.021     1.395233    62.86675
   trt_bri_lntime |   .1900911   .1123238    -2.81   0.005     .0597023    .6052461
   trt_per_lntime |   .1401045   .0770955    -3.57   0.000     .0476497    .4119495
    eventi_lntime |    .459382    .171476    -2.08   0.037     .2210249    .9547873
    dicOut_lntime |   .2110541    .091465    -3.59   0.000     .0902623    .4934931
-----------------------------------------------------------------------------------

I want to estimate how the effect of efficacy (dicOutcome) changes over time (e.g., at 9, 12, 18, and 24 months). However, I’m unsure how to obtain these time-specific hazard ratios.

Margins after survival analysis are tricky (at least for me...).

Can anyone help?

Thank you very much in advance.

Tags: cox model, survival analysis, Time-Varying Effects

Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#2

25 Mar 2025, 12:44

You won't be able to use -margins- with these homebrew interaction terms. You have to use factor variable notation in the -stcox- command. If you are not familiar with factor variable notation, read -help fvvarlist- for details. So, something like:

Code:

stcox (ib1.AED_4_trattamenti i.EventiYN i.dicOutcome)#c.lntime ib1.AED_4_trattamenti i.EventiYN i.dicOutcome foreach n of numlist 9 12 18 24 { local ln_t_`n' = ln(`n') } margins, at(lntime = (`ln_t_9' `ln_t_12' `ln_t_18' `ln_t_24')) dydx(dicOutcome)

should give you the marginal effects of dicOutcome on the hazard ratio at each of those times. If you truly want to compare them all, use the -pwcompare- option in the -margins- command.

Note: You say you wish to model EventiYN and dicOutcome as mediators. But in saying

I want to estimate how the effect of efficacy (dicOutcome) changes over time (e.g., at 9, 12, 18, and 24 months).

and in coding interactions involving the dicOutcome variable you are (implicitly) saying you want to treat dicOutcome as a moderator (aka effect-modifier). Also, you don't state what other effect you want to model dicOutcome as a mediator of. So I think you are confusing mediation and moderation, and really want dicOutcome as a moderator.

If you really want to analyze dicOutcome as a mediator, I don't know how to do that. I have never seen anyone analyze mediators in survival analysis, and I don't know if it has been worked out in any software. If somebody following along does know about mediation in survival analysis, I hope he or she will chime in.
Comment
Gianfranco Di Gennaro

Join Date: Oct 2020

Posts: 144
#3

25 Mar 2025, 13:50

Very strange, Clyde. The model you proposed doesn't converge ("flat region resulting in a missing likelihood").
In my mind it is mathematically equivalent to the one I coded !

I do understand what you mean. In any case, in my situation, both "dicOutcome" and "EventiYN" are part of the causal chain:

drug → EventiYN (dicOutcome) → drug discontinuation.

Therefore, in my view, they don't become moderators simply because they interact with time. Rather, the mediation effect is dynamic (time-dependent).
Am I wrong? i don't want to talk nonsense

I know that if I wanted to formally quantify this mediation, I'd need to use complex techniques (e.g., parametric G-formula, etc.).

However, even just by comparing the total treatment effect using:

Code:

stcox ib1.AED_4_trattamenti

and then contrasting it with the residual effect after adding EventiYN and dicOutcome to the model:

Code:

stcox ib1.AED_4_trattamenti i.EventiYN i.dicOutcome

I think we can get an idea of how much of the effect is absorbed by efficacy and adverse events."
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#4

25 Mar 2025, 14:56

In any case, in my situation, both "dicOutcome" and "EventiYN" are part of the causal chain:

drug → EventiYN (dicOutcome) → drug discontinuation.

is, indeed, an expression of mediation. But the -stcox- commands shown in this thread do not in any way express this chain of causality.

Therefore, in my view, they don't become moderators simply because they interact with time.

You are correct. The interaction with time simply allows their effects to vary over time. I retract what I said regarding moderation earlier: I was thinking that there were also interactions among drug, EventiYN, and dicOutcome--but there are not: they just each separately interact with lntime. I apologize for my confusion and for whatever confusion I caused you to have.

I'm somewhat surprised that the -stcox- with factor variable notation didn't converge, whereas your original non-fv version of it did. They are, as you note, equivalent models. Perhaps the two models' iterations were initialized differently, and the one with factor variable notation started out in an area where you cannot reach the maximum of the likelihood function due to an intervening flat region.

Here's another way to do it. Since you are not trying to interact drug, EventiYN, and dicOutcome with each other, you can get interaction with ln(_t) as follows:

Code:

stcox ib1.AED_4_trattamenti i.EventiYN i.dicOutcome, tvc(ib1.AED_4_trattamenti i.EventiYN i.dicOutcome) texp(ln(_t))

Perhaps this will converge.

All of that said, we still do not have a mediation model here. There is nothing in this code that captures even an association, let alone causality, from treatment to EventiYN or dicOutcome. Yes, it is true, that the addition of these intervening variables to a model without them will result in a decrease (actually, for hazard ratio, shift toward 1) in the treatment hazard ratio. But the same could be true of other variables that have only non-causal links to treatment and outcome, i.e. purely confounders.* This approach is very weak tea for establishing mediation. The whole field of mediation is a complicated one, and I do not pretend to be an expert in it. Suffice it to say the entire concept raises conceptual difficulties. And, on top of that, with non-linear models like -stcox-, even just calculating direct and indirect effects is a mess: with linear models you can at least build the indirect paths into the model by using SEM. While this doesn't establish causality per se, it does enable you to estimate what the causal effect might be if the relationship is, in fact, causal. With some non-linear models, you can move to -gsem- and build pathways. But that is not entirely satisfactory because it is unclear how to interpret the effect estimates derived from such a model.

*If your data are from a study in which treatment assignment was randomized, then you can at least hope to establish partial causality: from treatment to EventiYN and to dicOutcome. But that still doesn't establish a causal link between those putative-mediators and the outcome--and, of course, these particular mediators are inherently non-randomizable.

Again, I'm not an expert in mediation models, and if somebody following along has better advice than my, admittedly somewhat nihilistic, perspective, please do speak up.
Comment
Erik Ruzek

Join Date: Oct 2017

Posts: 461
#5

25 Mar 2025, 16:05

It is difficult, as Clyde says, but folks have been working on this for a decade plus (VanderWeele, 2011). There are some Stata-specific routines for doing mediation with a survival outcome. See the following:
Med4way: a Stata command to investigate mediating and interactive mechanisms using the four-way effect decomposition | International Journal of Epidemiology | Oxford Academic
Mediation analysis with survival data | Paul W Dickman
1 like
Comment
Gianfranco Di Gennaro

Join Date: Oct 2020

Posts: 144
#6

25 Mar 2025, 17:45

Thank you Clyde Schechter
I ran your code with tvc

Code:

stcox ib1.AED_4_trattamenti i.EventiYN i.dicOutcome, tvc(ib1.AED_4_trattamenti i.EventiYN i.dicOutcome) texp(ln(_t))

It gives me pretty dufferent results and maybe (I repeat: maybe) I understand why. The tvc() approach separates effects into fixed and time-varying components. I compared this modelto the manual interaction model, it performs worse in this case (just evaluating the Akaike and Bayesian Crieria).

Here remains the questiuon: if we do not want to call EventiYN and dicOutcome mediators (or "potential mediators"), what do we want to call them? Are they simple covariates? Actually, even if I have not conducted a formal mediation model, from a theoretical point of view, mediation is at least plausible. By the way, I have tried the Baron and Kenny framework (which I know is limited, especially in a time-to-event context) and the indications of mediation would be there.
I would ask you here to give me your opinion because frankly I struggle to attribute a role to EventiYN and dicOutcome.

I tried two other analytical approaches using a parametric gamma model, which is the distribution that gives me the best fit:

Code:

streg ib1.AED_4_trattamenti i.dicOutcome i.EventiYN, distribution(ggamma) tr streg ib1.AED_4_trattamenti ib1.AED_4_trattamenti i.EventiYN i.dicOutcome trt_bri_lntime trt_per_lntime eventi_lntime dicOut_lntime, distribution(ggamma) tr

and a joint model, to capture the dependence of the dynamics of the two longitudinal outcomes (dicOutcome and EventiYN) and the time-to-event outcome

Code:

gsem (EventiYN <- ib1.AED_4_trattamenti followup U1[id]@1, family(bernoulli) link (logit)) (dicOutcome <- ib1.AED_4_trattamenti followup U1[id]@1, family(bernoulli) link (logit)) (_t <- i.AED_4_trattamenti U1[id]@gamma, family(gamma, failure(_d) aft)), latent(U1) pweigh(ipws) covstructure(U1[id], unstructured)

Both with the models above, and with those enriched by manual interactions, I have more or less concordant estimates
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#7

25 Mar 2025, 20:18

I would call these variables additional predictors of time to discontinuation of treatment.

How do you know which model gives you the best fit. It can be treacherous using things like AIC, BIC, or, for that matter, any other log-likelihood based statistic to compare different parameterization models. The reason is that the likelihoods themselves usually have some normalizing constant factors (e.g. 1/sqrt(2*_pi) in the normal likelihood). Often, to speed up computation, and relying on the fact that only differences in log-likelihood matter for inference, the constants may be omitted from the calculation of the log-likelihoods that Stata reports, and that, of course, would then also affect the AIC and BIC values. Since these statistics are all used based on differences between them, not on their absolute values, for that purpose it doesn't matter if you include the constant or not in the calculation. But if you are comparing a generalized gamma with, say, a gompertz or a log-normal, the omitted constants from those likelihoods are different, and consequently the comparison of log-likelihoods across those models is not valid. I don't know in which situations Stata specifically omits constant factors from the likelihood calculations, but I know that it definitely does so sometimes. So unless you are sure that you are comparing apples to apples, you should not rely on likelihood-based tests to compare different parametric models: they should only be used to compare different sets of predictors and covariates in the same parametric model.

I'm also concerned about the joint model you show at the end. If you have no censored observations, then I think it's OK. But -gsem- does not handle censoring at all.
Comment
Gianfranco Di Gennaro

Join Date: Oct 2020

Posts: 144
#8

26 Mar 2025, 02:16

Thank you, Clyde, for letting me know about the correct use of estat ic in Stata. However, in general, I always at least examine the distribution of residuals and observed/expected values, and—when Stata allows it—goodness-of-fit estimates.

Regarding gsem, I’m surprised by what you’re saying because I’ve always seen gsem used in the context of joint models with scripts very similar to mine.

What do you mean by "no censored observations"? That everyone should have experienced the event? If that were the case, there would be no need for survival analysis.

As always, thank you for your extremely valuable suggestions.

Gianfranco
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30356
#9

26 Mar 2025, 08:46

Regarding gsem, I’m surprised by what you’re saying because I’ve always seen gsem used in the context of joint models with scripts very similar to mine.

Yes, of course. I have used models similar to that myself. But I have two concerns about it in your situation. The first is that the attempt to interpret it as a mediation model is questionable--with nonlinear link functions, the interpretation of the coefficients is not so straightforward. The second is that the model cannot possibly handle the censored observations properly.

What do you mean by "no censored observations"? That everyone should have experienced the event? If that were the case, there would be no need for survival analysis.

Yes, that's right. So, I'm inferring that you do have censored observations. In that case, the -gsem- model has to be wrong.
Comment

Announcement

*Estimating Time-Varying Effects of Efficacy in a Cox Model

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment